STA4173 Lecture 2, Summer 2023
In the last lecture, we focused on describing data.
Today, we will focus on drawing conclusions about populations using data.
Point estimate: The single value of a statistic that estimates the value of a parameter.
It is necessary to know how good our estimation is, or to quantify our uncertainty.
Confidence interval (CI): A range of plausible values for the parameter based on values observed in the sample.
Level of confidence: Denoted by (1-\alpha)100\%, the expected proportion of intervals that will contain the parameter if a large number of different samples is obtained.
e.g., 95% CI:
\alpha=0.05
If we draw 100 samples, we expect 95 of the CIs to contain the true value of the parameter.
e.g., 90% CI:
\alpha=0.10
If we draw 100 samples, we expect 90 of the CIs to contain the true value of the parameter.
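The "expected proportion of intervals" interpretation can be checked by simulation. The sketch below is illustrative only: the population mean, standard deviation, sample size, and seed are all hypothetical choices, not values from the lecture.

```r
set.seed(4173)   # arbitrary seed so the sketch is reproducible

mu <- 50         # hypothetical true population mean
sigma <- 10      # hypothetical population standard deviation
n <- 25          # hypothetical sample size
n_samples <- 5000

# For each simulated sample, build a 95% CI and record whether it contains mu
covers <- replicate(n_samples, {
  x  <- rnorm(n, mean = mu, sd = sigma)
  ci <- t.test(x, conf.level = 0.95)$conf.int
  ci[1] <= mu && mu <= ci[2]
})

coverage <- mean(covers)
coverage   # proportion of intervals that contain mu; expected to be near 0.95
```

Any single interval either contains \mu or it does not; the 95% describes the long-run behavior of the procedure.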
The point estimate corresponds to the parameter we are estimating.
If we are estimating \mu, \bar{x} is the point estimate.
If we are estimating p, \hat{p} is the point estimate.
The margin of error is critical value \times standard error.
The critical value will come from either the z or t distribution and depends on the level of confidence.
The standard error corresponds to the point estimate.
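Confidence intervals take the form point estimate \pm margin of error and are reported as (lower bound, upper bound), where
lower bound = point estimate - margin of error
upper bound = point estimate + margin of error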
Make sure to state your confidence intervals in numeric order.
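As a rough sketch of how the pieces fit together, the critical value can be looked up with `qnorm()` or `qt()` and combined with a point estimate and standard error. The helper name `make_ci` and all numeric inputs below are hypothetical, chosen only for illustration.

```r
conf_level <- 0.95
alpha <- 1 - conf_level

# Critical values for a two-sided interval
z_crit <- qnorm(1 - alpha / 2)         # from the z distribution
t_crit <- qt(1 - alpha / 2, df = 31)   # from the t distribution (hypothetical n = 32)

# Hypothetical helper: point estimate +/- critical value * standard error
make_ci <- function(estimate, crit, se) {
  moe <- crit * se   # margin of error
  c(lower = estimate - moe, upper = estimate + moe)
}

ci_example <- make_ci(estimate = 10, crit = z_crit, se = 2)
ci_example   # lower bound first, then upper bound
```

The t critical value is slightly larger than the z critical value, so t-based intervals are slightly wider for the same standard error.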
The (1-\alpha)100\% confidence interval for \mu is
\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}}
The critical value is t_{\alpha/2,n-1}.
Recall the computation of \bar{x} and s,
\bar{x} = \frac{\sum_{i=1}^n x_i}{n}
s = \sqrt{\frac{\sum_{i=1}^n x_i^2 - \frac{(\sum_{i=1}^n x_i)^2}{n}}{n-1}}
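The formulas above can be applied step by step. This is a minimal sketch on a small hypothetical sample (the vector `x` is made up for illustration), with `t.test()` used only as a cross-check.

```r
x <- c(21, 23, 19, 25, 22)   # hypothetical sample, not from the lecture
n <- length(x)

x_bar <- sum(x) / n
s <- sqrt((sum(x^2) - sum(x)^2 / n) / (n - 1))   # computational formula for s

alpha  <- 0.05
t_crit <- qt(1 - alpha / 2, df = n - 1)   # critical value t_{alpha/2, n-1}
moe    <- t_crit * s / sqrt(n)            # margin of error

ci_manual <- c(x_bar - moe, x_bar + moe)
ci_manual

# Cross-check against the built-in routine
ci_builtin <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)
ci_builtin
```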
We will use the t.test() function in R to find confidence intervals for \mu.
Recall the Motor Trend car road tests data (mtcars), built into R.
The data was extracted from the 1974 Motor Trend magazine, and includes aspects of car design and performance for 32 cars (1973-74 models).
One Sample t-test
data: mtcars$mpg
t = 18.857, df = 31, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
17.91768 22.26357
sample estimates:
mean of x
20.09062
The point estimate is \bar{x} = 20.09.
The 95% CI for \mu is (17.92, 22.26).
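The output shown above is consistent with a call along the following lines. The exact call used in the lecture is not shown, so treat this as an assumption; `conf.level = 0.95` is the default and is written out only for clarity.

```r
# mtcars ships with base R (datasets package), so nothing needs to be loaded
result <- t.test(mtcars$mpg, conf.level = 0.95)

result$estimate   # sample mean, the point estimate of mu
result$conf.int   # lower and upper bounds of the 95% CI

# Changing the confidence level only requires changing conf.level
ci_90 <- t.test(mtcars$mpg, conf.level = 0.90)$conf.int
ci_90
```

A 90% interval is narrower than the 95% interval computed from the same data.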
If the 95% CI for \mu is (17.92, 22.26),
Is the average gas mileage less than 25 mpg?
Is the average gas mileage greater than 20 mpg?
A friend of yours wants to play a simple coin-flipping game.
If the coin comes up heads, you win; if it comes up tails, your friend wins.
Suppose the outcome of five plays of the game is T, T, T, T, T.
Is your friend cheating?
If the coin is fair, the probability of flipping a tail is 0.5.
We can compute the probability of flipping five tails in a row. \begin{align*} P[\text{T, T, T, T, T}] &= 0.5 \times 0.5 \times 0.5 \times 0.5 \times 0.5 \\ &= 0.03125 \end{align*}
Is this probability low enough to believe your friend is cheating?
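The same arithmetic can be reproduced in R. This sketch assumes a fair coin and independent flips, as in the calculation above; the variable names are illustrative.

```r
p_tail  <- 0.5   # assumes a fair coin
n_flips <- 5

# Independent flips: multiply the individual probabilities
p_five_tails <- p_tail^n_flips
p_five_tails

# Same quantity from the binomial distribution: 5 tails in 5 flips
p_binom <- dbinom(n_flips, size = n_flips, prob = p_tail)
p_binom
```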
Hypothesis testing: A procedure, based on sample evidence and probability, used to test statements regarding a characteristic of one or more populations.
Steps in hypothesis testing
Make a statement regarding the nature of the population.
Collect evidence (sample data) to test the statement.
Analyze the data to assess the plausibility of the statement.
Note: if we have population parameters available, we do not need to perform a hypothesis test.
Hypothesis: A statement regarding a characteristic of one or more populations.
Null hypothesis, H_0: A statement to be tested.
This is a statement of no change, no effect, or no difference.
It is assumed true until evidence indicates otherwise.
Alternative hypothesis, H_1: A statement that we are trying to find evidence to support.
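One sample tests (for a mean, with hypothesized value \mu_0):
Two-tailed test: H_0: \mu = \mu_0 vs. H_1: \mu \ne \mu_0
Left-tailed test: H_0: \mu \ge \mu_0 vs. H_1: \mu < \mu_0
Right-tailed test: H_0: \mu \le \mu_0 vs. H_1: \mu > \mu_0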
The Blue Book price of a used three-year-old Chevy Corvette Z06 is $59,083.
Jamie wonders if the mean price of a used three-year-old Chevy Corvette Z06 in their area is different from $59,083.
What are the null and alternative hypotheses?
Is this a two-tailed, left-tailed, or right-tailed test?
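Two sample tests (for two means, \mu_1 and \mu_2):
Two-tailed test: H_0: \mu_1 = \mu_2 vs. H_1: \mu_1 \ne \mu_2
Left-tailed test: H_0: \mu_1 \ge \mu_2 vs. H_1: \mu_1 < \mu_2
Right-tailed test: H_0: \mu_1 \le \mu_2 vs. H_1: \mu_1 > \mu_2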
Dr. Seals will be buying a car this summer and is doing research to determine what to buy.
She knows that she wants a small car such as a Honda Civic and is willing to purchase in Birmingham, AL if the prices are cheaper.
What are the null and alternative hypotheses?
Is this a two-tailed, left-tailed, or right-tailed test?
We use data to draw conclusions about hypotheses.
If we draw the wrong conclusion, we make an error.
These can be classified as Type I (\alpha) or Type II (\beta) errors.
The Medco pharmaceutical company has just developed a new antibiotic.
A researcher for the Food and Drug Administration wishes to know if the percentage of children taking the new antibiotic who experience a headache as a side effect is more than 2%.
The researcher conducts a hypothesis test with H_0: p \le 0.02 vs. H_1: p > 0.02.
What does it mean to make a Type I error?
A Type I error means that we reject the null when we should not.
Here, that means that the researcher believes p > 0.02 when that is not true.
What does it mean to make a Type II error?
A Type II error means that we fail to reject the null when we should have rejected it.
Here, that means that the researcher believes p \le 0.02 when it is actually larger than 0.02.
\alpha and \beta denote the probabilities of making a Type I error and a Type II error, respectively.
\alpha = \text{P}[\text{reject } H_0 \text{ when } H_0 \text{ is true}]
\beta = \text{P}[\text{fail to reject } H_0 \text{ when } H_1 \text{ is true}]
We also call \alpha the level of significance.
We should choose \alpha based on the level of error we are willing to withstand in the experiment.
The \alpha that is commonly used is \alpha=0.05.
Sometimes a smaller \alpha is used; e.g., a clinical trial might use \alpha=0.01.
For a fixed sample size (n), \alpha and \beta are inversely related: decreasing \alpha increases \beta, and vice versa.
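The trade-off between \alpha and \beta can be illustrated with `power.t.test()` from base R, using \beta = 1 - power. The design below (one-sample test, n = 20, a true shift of half a standard deviation) is entirely hypothetical, as is the helper name `beta_for`.

```r
# Hypothetical design: one-sample t-test, n = 20, true shift of 0.5 SD
beta_for <- function(alpha) {
  pw <- power.t.test(n = 20, delta = 0.5, sd = 1, sig.level = alpha,
                     type = "one.sample", alternative = "two.sided")
  1 - pw$power   # beta = P[fail to reject H_0 when H_1 is true]
}

alphas <- c(0.10, 0.05, 0.01)
betas  <- sapply(alphas, beta_for)

data.frame(alpha = alphas, beta = round(betas, 3))
```

With n held fixed, each decrease in \alpha in this sketch comes with a larger \beta.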
After stating our hypotheses, we will construct a test statistic.
The choice of test statistic depends on:
The value of the test statistic depends on the sample data.
We will use the test statistic on our way to drawing conclusions about the hypotheses.
After constructing test statistics, we will find the corresponding p-value.
p-value: the probability of observing what we’ve observed or something more extreme, assuming the null hypothesis is true.
Finding a p-value depends on the distribution being used.
We will compare the p-value to \alpha in order to draw conclusions.
To find p-values for right-tailed tests: p = \text{P}[\text{distribution} \ge \text{calculated test statistic}]
To find p-values for left-tailed tests: p = \text{P}[\text{distribution} \le \text{calculated test statistic}]
To find p-values for two-tailed tests: p = 2 \times \text{P}[\text{distribution} \ge |\text{calculated test statistic}|]
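For a test statistic that follows a t distribution, these tail probabilities can be computed with `pt()`. The test statistic and degrees of freedom below are hypothetical values chosen for illustration.

```r
t_stat <- 1.8   # hypothetical calculated test statistic
df     <- 24    # hypothetical degrees of freedom

p_right <- pt(t_stat, df = df, lower.tail = FALSE)            # P[T >= t]
p_left  <- pt(t_stat, df = df)                                # P[T <= t]
p_two   <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)   # 2 * P[T >= |t|]

c(right = p_right, left = p_left, two_sided = p_two)
```

For a z test statistic, `pnorm()` plays the same role as `pt()` (without the `df` argument).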
Once we’ve found the p-value, we can draw a conclusion.
If p < \alpha, we reject H_0.
If p \ge \alpha, we fail to reject H_0.
Takeaways:
We never “accept” the null.
We always interpret in terms of H_1.
The Medco pharmaceutical company has just developed a new antibiotic.
Two percent of children taking competing antibiotics experience a headache as a side effect.
A researcher for the Food and Drug Administration believes that the proportion of children taking the new antibiotic who experience a headache as a side effect is more than 0.02.
When testing H_0: p \le 0.02 vs. H_1: p > 0.02, it was determined that the p-value was 0.017.
Draw the appropriate conclusion at the \alpha=0.05 level.
Because p=0.017 is less than \alpha=0.05, we reject H_0.
There is sufficient evidence to suggest that more than 2% of children taking the new antibiotic are experiencing headaches as a side effect.
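The decision rule is a simple comparison of the p-value to \alpha. The helper `decide` below is a hypothetical name used only to make the rule explicit; the second call shows how the conclusion for the same p-value would change under a stricter \alpha.

```r
# Hypothetical helper encoding the decision rule
decide <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) "reject H_0" else "fail to reject H_0"
}

decision_05 <- decide(0.017, alpha = 0.05)   # p-value from the example above
decision_01 <- decide(0.017, alpha = 0.01)   # same p-value, stricter alpha

decision_05
decision_01
```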
Hypothesis Test for One Mean, \mu
We will use the t.test() function to obtain the information for a hypothesis test for one \mu.
Recall the confidence interval, where we determined that the average gas mileage was less than 25.
Let’s formally test that with the one-sample t-test.
Hypotheses: H_0: \mu \ge 25 vs. H_1: \mu < 25
Test Statistic and p-Value
Rejection Region
Conclusion/Interpretation
Reject H_0.
There is sufficient evidence to suggest that the average gas mileage is less than 25 mpg.
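A left-tailed one-sample t-test of this kind could be run as sketched below. The hypotheses (H_0: \mu \ge 25 vs. H_1: \mu < 25) are inferred from the conclusion above, and \alpha = 0.05 is an assumption, since no significance level is stated for this example.

```r
# Left-tailed test: H_1 says the mean mpg is less than 25
test_25 <- t.test(mtcars$mpg, mu = 25, alternative = "less")

test_25$statistic   # calculated t statistic
test_25$parameter   # degrees of freedom, n - 1
test_25$p.value     # left-tail probability

alpha <- 0.05              # assumed significance level
test_25$p.value < alpha    # TRUE means reject H_0
```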
Recall the confidence interval, where we could not conclude that the average gas mileage was greater than 20.
Let’s formally test that with the one-sample t-test.
Hypotheses: H_0: \mu \le 20 vs. H_1: \mu > 20
Test Statistic and p-Value
Rejection Region
Conclusion/Interpretation
Fail to reject H_0.
There is not sufficient evidence to suggest that the average gas mileage is greater than 20 mpg.
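The right-tailed version could be sketched the same way. Again, the hypotheses (H_0: \mu \le 20 vs. H_1: \mu > 20) are inferred from the conclusion above and \alpha = 0.05 is an assumed significance level.

```r
# Right-tailed test: H_1 says the mean mpg is greater than 20
test_20 <- t.test(mtcars$mpg, mu = 20, alternative = "greater")

test_20$statistic   # calculated t statistic
test_20$p.value     # right-tail probability

alpha <- 0.05              # assumed significance level
test_20$p.value < alpha    # FALSE means fail to reject H_0
```

The sample mean (20.09) is above 20, but only slightly relative to the standard error, so the right-tail probability is large.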
Hypothesis testing depends on sample size.
For a fixed difference between the sample estimate and the hypothesized value, p-values decrease as the sample size increases.
As p-values decrease, we are more likely to reject the null hypothesis.
We must ask ourselves if the value we are testing against makes practical sense.
A new weight loss medication where the average amount of weight lost was 1 lb over 6 months.
A new weight loss medication where the average amount of weight lost was 15 lb over 6 months.
A new teaching method that raised final exam scores by 2 points.
A new teaching method that raised final exam scores by 15 points.
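The interplay between sample size and statistical significance can be sketched numerically. The summary statistics below are hypothetical: the observed difference from the null value and the sample standard deviation are held fixed while only n changes.

```r
# Hypothetical summary statistics, held fixed while n grows
diff_from_null <- 0.5   # x_bar - mu_0, a small difference
s <- 5                  # sample standard deviation

n <- c(10, 50, 100, 500, 1000)
t_stat  <- diff_from_null / (s / sqrt(n))
p_value <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)   # two-tailed

data.frame(n, t_stat = round(t_stat, 3), p_value = signif(p_value, 3))
```

In this sketch the same small difference is not significant at n = 10 but is at n = 1000, which is why a statistically significant result still needs to be judged for practical importance.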